The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus
Abstract
This paper describes the interconnection network used in the Cray T3E multiprocessor. The network is a bidirectional 3D torus with fully adaptive routing, optimized virtual channel assignments, integrated barrier synchronization support and considerable fault tolerance. The routers are built with LSI’s 500K ASIC technology with custom transmitters/receivers driving low-voltage differential signals at 375 MHz, for a link data payload capacity of approximately 500 MB/s.
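As a rough illustration of the bidirectional 3D torus topology named in the abstract, the following sketch computes a node's neighbors and the minimal hop count between two nodes. It is not the T3E routing algorithm: the function names and the 8 x 8 x 8 dimensions are assumptions chosen for the example, and virtual channels, adaptivity and fault handling are not modeled.

```python
def torus_neighbors(node, dims):
    """Return the nodes one hop away in the + and - direction of each dimension.

    Coordinates wrap around modulo the size of each dimension, which is
    what distinguishes a torus from a plain mesh.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (1, -1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % size
            neighbors.append(tuple(coords))
    return neighbors


def torus_hops(src, dst, dims):
    """Return the minimal hop count between src and dst on the torus.

    In each dimension a packet may travel either way around the ring,
    so the per-dimension cost is the shorter of the two directions.
    """
    hops = 0
    for s, d, size in zip(src, dst, dims):
        delta = abs(s - d) % size
        hops += min(delta, size - delta)
    return hops


# Hypothetical 8 x 8 x 8 torus, used only for illustration.
DIMS = (8, 8, 8)
print(torus_neighbors((0, 0, 0), DIMS))
print(torus_hops((0, 0, 0), (7, 4, 1), DIMS))
```

In this sketch the wraparound link makes node (7, 0, 0) a direct neighbor of (0, 0, 0), so the distance along the first dimension is one hop rather than seven.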
Similar resources
Parallel Rendering of 3D AMR Data on the SGI/Cray T3E
This paper describes work-in-progress on developing parallel visualization strategies for 3D Adaptive Mesh Refinement (AMR) data. AMR is a simple and powerful tool for modeling many important scientific and engineering problems. However, visualization tools for 3D AMR data are not generally available. Converting AMR data onto a uniform mesh would result in high storage requirements, and renderi...
Analysis of fault-tolerant routing algorithms in k-ary n-cube networks
The success of large-scale multicomputers is highly dependent on the efficiency of their underlying interconnection networks. K-ary n-cubes have been one of the most popular networks for multicomputers due to their desirable properties, such as ease of implementation and ability to reduce message latency by exploiting communication locality found in many parallel applications. The two most comm...
Perspectives in Performance Evaluation of Large ATM Networks
The demand for broadband communication is becoming more and more urgent. In order to achieve a successful expansion of the ATM technology, performance evaluation of large networks is a mandatory step, and the challenge is now to simulate them in one piece. Distributed simulation is a promising tool for the simulation of such systems, which cannot be handled sequentially anymore. This paper presents ...
Parallel Performance of a 3D Elliptic Solver
It was recently shown that block-circulant preconditioners applied to a conjugate gradient method used to solve structured sparse linear systems arising from 2D or 3D elliptic problems have good numerical properties and a potential for high parallel efficiency. In this note parallel performance of a circulant block-factorization based preconditioner applied to a 3D model problem is investigated...
Global Address Space, Non-Uniform Bandwidth: A Memory System Performance Characterization of Parallel Systems
Many parallel systems offer a simple view of memory: all storage cells are addressed uniformly. Despite a uniform view of the memory, the machines differ significantly in their memory system performance (and may offer slightly different consistency models). Cached and local memory accesses are much faster than remote read accesses to data generated by another processor or remote write to data i...